POKer: a Partial Order Kernel for Comparing Strings with Alternative Substrings
Abstract
We introduce a Partial Order Kernel (POKer) on the weighted sum of local alignment scores that can be used for comparison and classification of strings containing alternative substrings of variable length. POKer is defined over the product of two directed acyclic graphs, each representing a string with alternative substrings, and is computed efficiently using dynamic programming. We evaluate the performance of POKer with Support Vector Machines on a dataset of strings generated by detecting overlapping motifs in a set of simulated DNA sequences. Compared to a generalization of a state-of-the-art string kernel, POKer achieves a higher classification accuracy.
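The abstract does not state POKer's recurrence, so the following is only a rough sketch of the general idea of running dynamic programming over the product of two directed acyclic graphs. It is not the authors' kernel: it swaps in a much simpler, gap-free similarity that sums, over every pair of matching path segments in the two graphs, a weight `lam ** length`. The function names, the graph encoding (one label per node plus an edge list), and the decay parameter `lam` are all assumptions made for illustration.

```python
from collections import defaultdict


def topological_order(n, edges):
    """Return the nodes 0..n-1 of a DAG in topological order (Kahn's algorithm)."""
    indegree = [0] * n
    successors = defaultdict(list)
    for a, b in edges:
        successors[a].append(b)
        indegree[b] += 1
    order = [i for i in range(n) if indegree[i] == 0]
    i = 0
    while i < len(order):
        for nxt in successors[order[i]]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                order.append(nxt)
        i += 1
    return order


def predecessors(n, edges):
    """Map each node to the list of its direct predecessors."""
    preds = [[] for _ in range(n)]
    for a, b in edges:
        preds[b].append(a)
    return preds


def dag_match_kernel(labels_a, edges_a, labels_b, edges_b, lam=0.5):
    """Illustrative stand-in, NOT POKer itself.

    Sums lam ** length over all pairs of identically labeled path
    segments, one taken from each DAG. Alternative substrings are
    encoded as branches that later rejoin.
    """
    order_a = topological_order(len(labels_a), edges_a)
    order_b = topological_order(len(labels_b), edges_b)
    preds_a = predecessors(len(labels_a), edges_a)
    preds_b = predecessors(len(labels_b), edges_b)

    # m[u, v]: weighted count of matching segments ending at node pair (u, v).
    m = {}
    total = 0.0
    for u in order_a:
        for v in order_b:
            if labels_a[u] != labels_b[v]:
                m[u, v] = 0.0
                continue
            # Extend every matching segment that ends at a predecessor pair.
            extended = sum(m[p, q] for p in preds_a[u] for q in preds_b[v])
            m[u, v] = lam * (1.0 + extended)
            total += m[u, v]
    return total


# "A" followed by the alternatives "C" or "G", followed by "T".
alt_labels = ["A", "C", "G", "T"]
alt_edges = [(0, 1), (0, 2), (1, 3), (2, 3)]

# The plain string "AGT" as a chain-shaped DAG.
chain_labels = ["A", "G", "T"]
chain_edges = [(0, 1), (1, 2)]

similarity = dag_match_kernel(alt_labels, alt_edges, chain_labels, chain_edges)
```

Processing node pairs in topological order guarantees that every predecessor pair has been scored before it is needed, which is what keeps the computation proportional to the size of the product graph rather than to the number of paths. A kernel based on local alignment would additionally need gap and substitution terms, which this sketch leaves out.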
Similar resources
String Subsequence Kernels for Text Classification
This paper explores the string subsequence kernel, a kernel function whose feature space is generated by subsequences of strings. This kernel compares two strings based on the number of occurrences of common substrings they contain, where each common substring is weighted based on how contiguous that substring is within the string. Although a recursive definition of the string subsequence kerne...
A Compact Scheme for a Partial Integro-Differential Equation with Weakly Singular Kernel
A compact finite difference scheme is applied to a partial integro-differential equation with a weakly singular kernel. The product trapezoidal method is applied for discretization of the integral term. The order of accuracy in space and time is , where . Stability and convergence in norm are discussed through the energy method. Numerical examples are provided to confirm the theoretical prediction ...
Position-Aware String Kernels with Weighted Shifts and a General Framework to Apply String Kernels to Other Structured Data
In combination with efficient kernel-based learning machines such as the Support Vector Machine (SVM), string kernels have proven to be significantly effective in a wide range of research areas (e.g. bioinformatics, text analysis, voice analysis). Many of the string kernels proposed so far take advantage of simpler kernels such as trivial comparison of characters and/or substrings, and are classifie...
A Fast and Accurate Global Maximum Power Point Tracking Method for Solar Strings under Partial Shading Conditions
This paper presents a model-based approach for the global maximum power point (GMPP) tracking of solar strings under partial shading conditions. In the proposed method, the GMPP voltage is estimated without any need to solve numerically the implicit and nonlinear equations of the photovoltaic (PV) string model. In contrast to the existing methods in which first the locations of all the local pe...
The relationship between HRR-based similarity and similarity based on structural kernels
Work in machine learning on kernel-based methods over discrete structures, such as strings and trees, uses a variety of kernels to measure similarity between structures (Haussler, 1999; Collins and Duffy, 2002; Bod, 1998). For example, a kernel for strings could count the number of matching substrings, and a kernel for trees could count the number of matching subtrees. A kernel is always a dot pr...
Publication date: 2018